ABSTRACT


Financial distress, and the business failure that can
follow it, is usually an extremely costly and
disruptive event for any company or organization.
Statistical financial distress prediction models
try to predict whether a business will experience
financial failure in the future. Discriminant analysis
and logistic regression have been the most popular
approaches, but a large
number of data mining and machine learning
techniques can also be used for this purpose. This
paper studies machine learning based
bankruptcy prediction models that use decision
trees, naive Bayes, neural networks and other
classification algorithms. These algorithms have been
applied to financial distress prediction and compared
with each other, as well as with the most popular
approaches, on metrics such as accuracy,
error rate and model building time.


Keywords: Bankruptcy, Bankruptcy prediction system,
Classification, Machine learning techniques, Credit
Card Fraud Detection


1. INTRODUCTION 


Techniques for predicting the bankruptcy of companies and
financial organizations have become an important issue in
recent years. In India, bankruptcy has recently become a
prominent topic in banking, social and political circles. The high
individual, economic, and social costs inherent in
corporate failures or bankruptcies have prompted efforts
to provide better insight into, and prediction of,
bankruptcy events [1]. Given the radical changes brought by
globalization, more accurate forecasting of corporate
financial distress would provide useful information for
decision-makers such as stockholders, creditors,
governmental officials, and even the general public. In
fact, corporate bankruptcies can be caused by many
factors, such as wrong investment decisions, a poor
investment environment, low cash flow and so on [1].
Therefore, the many current methods for predicting
corporate failure must be continuously improved.


Bankruptcy prediction is a typical binary classification
problem: there are only two possible outcomes,
bankruptcy and non-bankruptcy. Many
researchers have proposed classical bankruptcy
prediction models based on statistical methods [2].
However, the validity of these traditional statistical
methods depends mainly on the subjective judgments of
human financial experts when selecting some parameters,
which in turn inevitably introduces feature selection bias. With the
development of data mining techniques, machine
learning methods have been exploited by many
researchers for the bankruptcy prediction problem, since
these methods can provide an unbiased feature selection
and decision-making mechanism.


In this paper, different machine learning techniques are
employed to predict bankruptcy. The resulting support system can
be used by stockholders and investors to assess the
performance of a company based on the nature of the risk
associated with it.


2. LITERATURE REVIEW 


In [1], the authors compare some traditional statistical
methods for predicting financial distress with some more
'unconventional' methods, such as decision tree
classification, neural networks, and evolutionary
computation techniques, using data collected from 200
Taiwan Stock Exchange Corporation (TSEC) listed
companies. Empirical experiments were conducted using
a total of 42 ratios, including 33 financial, 8 non-financial
and 1 combined macroeconomic index, with principal
component analysis (PCA) used to extract suitable variables.


In [2], a semi-parametric Cox survival
analysis model and non-parametric CART decision trees
are applied to financial distress prediction and
compared with each other as well as with the most popular
approaches. The analysis is done over a variety of cost
ratios (Type I error cost : Type II error cost) and
prediction intervals, as these differ depending on the
situation. The
results show that decision trees and survival analysis
models have good prediction accuracy, which justifies their
use and supports further investigation.
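

To make the role of the cost ratio concrete, the following minimal
sketch compares two classifiers by total misclassification cost at
several Type I : Type II cost ratios. The function name
`misclassification_cost`, the error counts, and the convention that a
Type I error is a bankrupt firm classified as healthy are assumptions
made for illustration; they are not taken from [2].

```python
def misclassification_cost(type1_errors, type2_errors, cost_ratio):
    """Total cost when one Type II error costs 1 unit and one Type I
    error costs `cost_ratio` units (Type I : Type II = cost_ratio : 1)."""
    return cost_ratio * type1_errors + type2_errors


# Hypothetical error counts for two models on the same test set.
model_a = {"type1": 5, "type2": 30}   # few missed bankruptcies, many false alarms
model_b = {"type1": 15, "type2": 5}   # more missed bankruptcies, few false alarms

for ratio in (1, 5, 20):
    cost_a = misclassification_cost(model_a["type1"], model_a["type2"], ratio)
    cost_b = misclassification_cost(model_b["type1"], model_b["type2"], ratio)
    cheaper = "A" if cost_a < cost_b else "B"
    print(f"ratio {ratio}:1  cost A = {cost_a}, cost B = {cost_b}, cheaper: {cheaper}")
```

With these illustrative counts, model B is cheaper at a 1:1 ratio while
model A is cheaper at 5:1 and 20:1, which is why a comparison at a
single cost ratio can be misleading.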


The algorithm proposed in [3] is successfully applied to the
bankruptcy prediction problem, with experimental data
sets taken from the UCI Machine Learning
Repository. The simulation results show the superiority
of the proposed algorithm over traditional SVM-based
methods combined with a genetic algorithm (GA) or the
particle swarm optimization (PSO) algorithm alone.


In [4], researchers investigate the effect of sampling
methods on the performance of quantitative bankruptcy
prediction models on real, highly imbalanced datasets.
Seven sampling methods and five quantitative models
are tested on two real, highly imbalanced datasets. A
comparison of model performance on a random
paired sample set and a real imbalanced sample set is also
conducted. The experimental results suggest that the
proper sampling method for developing prediction models
depends mainly on the number of bankruptcies in the
training sample set.


The authors of [5] propose
the use of Jordan Recurrent Neural Networks
(JRNN) to classify and predict corporate bankruptcy
based on financial ratios. The feedback connections in a
JRNN enable the network to retain important
information, allowing it to work more
effectively. The analysis of results showed that the JRNN works
very well in bankruptcy prediction, with an average success
rate of 81.3785%.


Neural networks can process a
tremendous number of attributes, but this frequently results in
overfitting as more data is taken in. By
using K-Nearest Neighbor and Random Forest, the authors of
[6] obtain better results from different perspectives and
test which algorithm is better suited to
bankruptcy prediction by comparing the results of the
two methods.
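

The sampling question studied in [4] can be illustrated with a
minimal sketch of random undersampling, which balances a training set
by discarding majority-class records. Random undersampling is used
here only as a familiar example; the source does not list the seven
methods evaluated in [4]. The function name `random_undersample`, the
record layout and the class counts are hypothetical.

```python
import random


def random_undersample(records, label_of, seed=0):
    """Return a balanced sample by randomly dropping majority-class records.

    `records` is any list; `label_of` maps a record to its class label
    (1 = bankrupt, 0 = non-bankrupt in this sketch).
    """
    rng = random.Random(seed)
    minority = [r for r in records if label_of(r) == 1]
    majority = [r for r in records if label_of(r) == 0]
    if len(minority) > len(majority):
        minority, majority = majority, minority
    # Keep as many majority records as there are minority records.
    kept = rng.sample(majority, len(minority))
    balanced = minority + kept
    rng.shuffle(balanced)
    return balanced


# Hypothetical training set: 10 bankrupt firms among 100.
train = [{"id": i, "bankrupt": 1 if i < 10 else 0} for i in range(100)]
balanced = random_undersample(train, lambda r: r["bankrupt"])
print(len(balanced), sum(r["bankrupt"] for r in balanced))
```

The balanced set contains every bankrupt record and an equally sized
random subset of the rest, so its size is limited by the number of
bankruptcies in the training sample.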


3. PROPOSED WORK 


The framework proposed in this work is depicted in
Figure 1. The proposed prediction framework processes
each transaction and classifies it as
high or low risk using the proposed method. The
predictive model can further be used to
generate alerts for high-risk transactions.
Investigators check these alerts and provide feedback
for each alert, i.e. true positive (fraud) or false positive
(genuine). The proposed model uses suitable pre-processing
and attribute selection techniques along with the
proposed classification techniques.
Figure 1: Proposed model
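

The alert-and-feedback loop described above can be sketched as
follows. The risk scores, the threshold of 0.6, and the function names
`generate_alerts` and `alert_precision` are hypothetical; the source
does not specify how risk scores are thresholded or how investigator
feedback is recorded.

```python
def generate_alerts(scores, threshold=0.5):
    """Return the ids of cases whose risk score meets the threshold."""
    return [case_id for case_id, score in scores.items() if score >= threshold]


def alert_precision(alerts, feedback):
    """Share of reviewed alerts that investigators marked as true positives."""
    reviewed = [a for a in alerts if a in feedback]
    if not reviewed:
        return 0.0
    return sum(1 for a in reviewed if feedback[a]) / len(reviewed)


# Hypothetical risk scores produced by a classifier.
scores = {"c1": 0.91, "c2": 0.12, "c3": 0.67, "c4": 0.48, "c5": 0.80}
alerts = generate_alerts(scores, threshold=0.6)

# Hypothetical investigator feedback: True = true positive, False = false positive.
feedback = {"c1": True, "c3": False, "c5": True}
print(alerts, alert_precision(alerts, feedback))
```

Tracking the share of confirmed alerts in this way gives a simple
check on whether the chosen threshold produces too many false
positives for investigators to review.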


4. EXPERIMENTAL SETUP AND METHODOLOGY
4.1 Experimental Setup


Weka 3.6.11 is used as the data mining (DM) tool for simulation.
Weka is installed on the Windows 7 operating system.
The data used in this study is obtained
from https://archive.ics.uci.edu/ml/machine-learning-databases/00365/,
provided by the University of
California at Irvine (UCI). The data set consists of 1000
Polish companies, 19.4% of which went bankrupt
during 2000-2012. The dataset description is presented in
Table 1.


Dataset | No. of Features | Total Instances
Bankruptcy (Polish Companies) | 65 | 5910


Table 1: Details of Dataset 
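

Because the text reports that 19.4% of the companies went bankrupt,
accuracy figures need a reference point. The short calculation below
shows what a trivial model that labels every company as non-bankrupt
would score; it assumes the 19.4% share applies to the evaluation
data, which the source does not state explicitly.

```python
bankrupt_share = 0.194  # share of bankrupt companies stated in the text

# A trivial model that labels every company "non-bankrupt".
baseline_accuracy = 1.0 - bankrupt_share
baseline_error_rate = bankrupt_share
bankrupt_recall = 0.0  # such a model never detects a bankruptcy

print(f"accuracy = {baseline_accuracy:.1%}, error rate = {baseline_error_rate:.1%}")
```

Under that assumption, a classifier is only informative if its
accuracy exceeds this baseline and it also detects some bankruptcies.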
4.2 Methodology
The experimental methodology involves the following steps:


Preprocessing of the dataset
Applying feature selection
Applying the new ML classifier
Evaluating the results
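

The four steps above can be sketched end to end in a few lines. The
study itself uses Weka; the pure-Python code below is only an
illustration under stated assumptions. Mean imputation, the
class-mean-gap feature filter, and Gaussian naive Bayes are stand-ins
chosen for brevity (the source does not name the preprocessing,
feature selection, or new classifier used), and the toy data is
hypothetical.

```python
import math
import time


def impute_means(rows):
    """Step 1: replace missing values (None) with the column mean."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        present = [r[j] for r in rows if r[j] is not None]
        means.append(sum(present) / len(present))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]


def select_features(rows, labels, k):
    """Step 2: keep the k columns whose class means differ most (a simple filter)."""
    def gap(j):
        pos = [r[j] for r, y in zip(rows, labels) if y == 1]
        neg = [r[j] for r, y in zip(rows, labels) if y == 0]
        return abs(sum(pos) / len(pos) - sum(neg) / len(neg))
    return sorted(sorted(range(len(rows[0])), key=gap, reverse=True)[:k])


def fit_gaussian_nb(rows, labels):
    """Step 3: estimate per-class priors, means and variances."""
    model = {}
    for c in set(labels):
        members = [r for r, y in zip(rows, labels) if y == c]
        stats = []
        for col in zip(*members):
            mean = sum(col) / len(col)
            var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9
            stats.append((mean, var))
        model[c] = (len(members) / len(rows), stats)
    return model


def predict(model, row):
    """Return the class with the highest log-posterior."""
    def log_post(c):
        prior, stats = model[c]
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
            for v, (mean, var) in zip(row, stats)
        )
    return max(model, key=log_post)


# Hypothetical toy data: three ratios per company, label 1 = bankrupt.
X = [[0.10, 5.0, 1.2], [0.20, None, 1.1], [0.15, 4.0, 1.3],
     [0.90, 5.5, 1.2], [0.80, 4.5, None], [0.85, 5.0, 1.1]]
y = [1, 1, 1, 0, 0, 0]

X = impute_means(X)
cols = select_features(X, y, k=1)
X_sel = [[r[j] for j in cols] for r in X]

start = time.perf_counter()
model = fit_gaussian_nb(X_sel, y)
build_time = time.perf_counter() - start

# Step 4: evaluate (on the training data only, to keep the sketch short).
predictions = [predict(model, r) for r in X_sel]
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
print(f"accuracy = {accuracy:.2f}, error rate = {1 - accuracy:.2f}, "
      f"build time = {build_time:.6f} s")
```

A real evaluation would use a held-out test set or cross-validation
rather than the training data, and the unscaled class-mean gap favors
features with large numeric ranges, so it should be read as a
placeholder for a proper attribute selection method.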




5. CONCLUSION & FUTURE WORK


This paper proposed a bankruptcy or financial distress
prediction model based on machine learning techniques.
The proposed system gives better accuracy and a lower error rate
than existing prediction models. This survey
also studied the various algorithms and their
performance metrics. In future work, the model will be
extended to cloud-based and real-time data.